The world weather dataset is imported from Kaggle (https://www.kaggle.com/datasets/nelgiriyewithana/global-weather-repository). The weather data has 9937 records of 41 variables. The columns cover country, location name, latitude, longitude and time zone; temperature in Celsius and Fahrenheit; weather conditions as text; and wind, pressure, precipitation, humidity and cloud cover. Additionally, the air quality index and its related variables, along with sunrise, sunset, moonrise and moonset times, were recorded for all the countries on specific days.
| | country | location_name | latitude | longitude | timezone | last_updated_epoch | last_updated | temperature_celsius | temperature_fahrenheit | condition_text | ... | air_quality_PM2.5 | air_quality_PM10 | air_quality_us-epa-index | air_quality_gb-defra-index | sunrise | sunset | moonrise | moonset | moon_phase | moon_illumination |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | Kabul | 34.52 | 69.18 | Asia/Kabul | 1693301400 | 2023-08-29 14:00 | 28.8 | 83.8 | Sunny | ... | 7.9 | 11.1 | 1 | 1 | 05:24 AM | 06:24 PM | 05:39 PM | 02:48 AM | Waxing Gibbous | 93 |
| 1 | Albania | Tirana | 41.33 | 19.82 | Europe/Tirane | 1693301400 | 2023-08-29 11:30 | 27.0 | 80.6 | Partly cloudy | ... | 28.2 | 29.6 | 2 | 3 | 06:04 AM | 07:19 PM | 06:50 PM | 03:25 AM | Waxing Gibbous | 93 |
| 2 | Algeria | Algiers | 36.76 | 3.05 | Africa/Algiers | 1693301400 | 2023-08-29 10:30 | 28.0 | 82.4 | Partly cloudy | ... | 6.4 | 7.9 | 1 | 1 | 06:16 AM | 07:21 PM | 06:46 PM | 03:50 AM | Waxing Gibbous | 93 |
| 3 | Andorra | Andorra La Vella | 42.50 | 1.52 | Europe/Andorra | 1693301400 | 2023-08-29 11:30 | 10.2 | 50.4 | Sunny | ... | 0.5 | 0.8 | 1 | 1 | 07:16 AM | 08:34 PM | 08:08 PM | 04:38 AM | Waxing Gibbous | 93 |
| 4 | Angola | Luanda | -8.84 | 13.23 | Africa/Luanda | 1693301400 | 2023-08-29 10:30 | 25.0 | 77.0 | Partly cloudy | ... | 139.6 | 203.3 | 4 | 10 | 06:11 AM | 06:06 PM | 04:43 PM | 04:41 AM | Waxing Gibbous | 93 |
5 rows × 41 columns
As the dataset contains 9936 rows of data, let's sample a subset of 3500 rows to work with. There are no missing values in the dataset. The column names are listed below.
Index(['country', 'location_name', 'latitude', 'longitude', 'timezone',
'last_updated_epoch', 'last_updated', 'temperature_celsius',
'temperature_fahrenheit', 'condition_text', 'wind_mph', 'wind_kph',
'wind_degree', 'wind_direction', 'pressure_mb', 'pressure_in',
'precip_mm', 'precip_in', 'humidity', 'cloud', 'feels_like_celsius',
'feels_like_fahrenheit', 'visibility_km', 'visibility_miles',
'uv_index', 'gust_mph', 'gust_kph', 'air_quality_Carbon_Monoxide',
'air_quality_Ozone', 'air_quality_Nitrogen_dioxide',
'air_quality_Sulphur_dioxide', 'air_quality_PM2.5', 'air_quality_PM10',
'air_quality_us-epa-index', 'air_quality_gb-defra-index', 'sunrise',
'sunset', 'moonrise', 'moonset', 'moon_phase', 'moon_illumination'],
dtype='object')
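The sampling and missing-value check described above can be sketched as follows. This is a minimal sketch, not the notebook's original code: the helper names `sample_weather` and `count_missing`, the seed, and the CSV file name in the comment are assumptions, and the small `demo` frame (built from the five rows shown in the preview) only stands in for the full download.

```python
import pandas as pd


def sample_weather(df: pd.DataFrame, n: int = 3500, seed: int = 42) -> pd.DataFrame:
    """Return a reproducible random sample of at most n rows."""
    n = min(n, len(df))  # guard against asking for more rows than exist
    return df.sample(n=n, random_state=seed).reset_index(drop=True)


def count_missing(df: pd.DataFrame) -> pd.Series:
    """Count missing values per column."""
    return df.isna().sum()


# With the real data one would load the CSV first (file name is an assumption):
# weather = pd.read_csv("GlobalWeatherRepository.csv")

# Stand-in frame using the five preview rows, so the sketch runs offline.
demo = pd.DataFrame(
    {
        "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
        "location_name": ["Kabul", "Tirana", "Algiers", "Andorra La Vella", "Luanda"],
        "temperature_celsius": [28.8, 27.0, 28.0, 10.2, 25.0],
        "condition_text": ["Sunny", "Partly cloudy", "Partly cloudy", "Sunny", "Partly cloudy"],
    }
)

subset = sample_weather(demo, n=3)
missing = count_missing(subset)
print(subset.columns)
print(missing)
```

On the full dataset, `sample_weather(weather)` would return the 3500-row subset, and a `missing` series of all zeros would confirm that no values are absent.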
A bar plot of the average wind speed (mph) for each weather condition was plotted.
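One way to produce such a bar plot is to group by `condition_text` and average `wind_mph`. The sketch below is an illustration under assumptions: the function names are hypothetical, the wind values in `demo` are made up, and matplotlib is assumed to be installed for the plotting step.

```python
import pandas as pd


def mean_wind_by_condition(df: pd.DataFrame) -> pd.Series:
    """Average wind speed (mph) per weather condition, highest first."""
    return (
        df.groupby("condition_text")["wind_mph"]
        .mean()
        .sort_values(ascending=False)
    )


def plot_mean_wind(mean_wind: pd.Series):
    """Draw the averages as a bar chart (requires matplotlib)."""
    import matplotlib.pyplot as plt

    ax = mean_wind.plot.bar(figsize=(10, 4))
    ax.set_xlabel("Weather condition")
    ax.set_ylabel("Average wind (mph)")
    plt.tight_layout()
    return ax


# Made-up wind values, only to show the shape of the computation.
demo = pd.DataFrame(
    {
        "condition_text": ["Sunny", "Sunny", "Partly cloudy"],
        "wind_mph": [4.0, 6.0, 9.0],
    }
)

mean_wind = mean_wind_by_condition(demo)
print(mean_wind)
```

Calling `plot_mean_wind(mean_wind)` then draws the chart; sorting the averages first keeps the bars in a readable order.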
| | air_quality_Ozone | air_quality_PM10 | air_quality_Carbon_Monoxide | air_quality_Nitrogen_dioxide | air_quality_Sulphur_dioxide | air_quality_us-epa-index |
|---|---|---|---|---|---|---|
| air_quality_Ozone | 1.00 | -0.05 | -0.18 | -0.28 | -0.06 | -0.07 |
| air_quality_PM10 | -0.05 | 1.00 | 0.80 | 0.48 | 0.39 | 0.73 |
| air_quality_Carbon_Monoxide | -0.18 | 0.80 | 1.00 | 0.54 | 0.43 | 0.55 |
| air_quality_Nitrogen_dioxide | -0.28 | 0.48 | 0.54 | 1.00 | 0.70 | 0.53 |
| air_quality_Sulphur_dioxide | -0.06 | 0.39 | 0.43 | 0.70 | 1.00 | 0.46 |
| air_quality_us-epa-index | -0.07 | 0.73 | 0.55 | 0.53 | 0.46 | 1.00 |
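The table above reads as a pairwise correlation matrix of six air-quality columns. A matrix of this shape could be computed as sketched below, assuming Pearson correlation (the source does not name the method) and rounding to two decimals. The helper name `air_quality_corr` is hypothetical, and the random `demo` data only stands in for the sampled dataset, so its values will not match the table.

```python
import numpy as np
import pandas as pd

# The six columns shown in the table above.
AIR_QUALITY_COLS = [
    "air_quality_Ozone",
    "air_quality_PM10",
    "air_quality_Carbon_Monoxide",
    "air_quality_Nitrogen_dioxide",
    "air_quality_Sulphur_dioxide",
    "air_quality_us-epa-index",
]


def air_quality_corr(df: pd.DataFrame, cols=AIR_QUALITY_COLS) -> pd.DataFrame:
    """Pairwise Pearson correlation of the selected columns, rounded to 2 dp."""
    return df[cols].corr(method="pearson").round(2)


# Random stand-in data; replace with the sampled weather frame.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.random((50, len(AIR_QUALITY_COLS))), columns=AIR_QUALITY_COLS)

corr = air_quality_corr(demo)
print(corr)
```

In the table itself, the strongest off-diagonal value is 0.80 between PM10 and carbon monoxide, while ozone is weakly negatively correlated with every other column.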
Let's look at the relationships between different weather phenomena. Pairwise scatter plots were drawn for precipitation (inches), wind (mph), humidity (%), pressure (mb) and gust (mph).
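A pairwise scatter plot of these five variables can be drawn with `pandas.plotting.scatter_matrix`. The sketch below is one possible approach rather than the notebook's own code: the mapping of the five variables to the columns `precip_in`, `wind_mph`, `humidity`, `pressure_mb` and `gust_mph` is inferred from the units in the text, the function name is hypothetical, and matplotlib is assumed to be installed for the plotting step.

```python
from itertools import combinations

import pandas as pd

# Columns inferred from the units named in the text.
SCATTER_COLS = ["precip_in", "wind_mph", "humidity", "pressure_mb", "gust_mph"]


def pairwise_scatter(df: pd.DataFrame, cols=SCATTER_COLS):
    """Draw a grid of scatter plots for every pair of columns (requires matplotlib)."""
    return pd.plotting.scatter_matrix(
        df[cols],
        alpha=0.5,        # semi-transparent points reduce overplotting
        figsize=(10, 10),
        diagonal="hist",  # histogram of each variable on the diagonal
    )


# Five variables give ten distinct pairs to inspect.
pairs = list(combinations(SCATTER_COLS, 2))
print(len(pairs))
```

Calling `pairwise_scatter(subset)` on the sampled frame produces a 5 × 5 grid in which each off-diagonal panel shows one of the ten pairs.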